Optimizing Classifier Performance in Word Sense Disambiguation by Redefining Sense Classes

Authors

  • Upali Sathyajith Kohomban
  • Wee Sun Lee
Abstract

Learning word sense classes has been shown to be useful in fine-grained word sense disambiguation [Kohomban and Lee, 2005]. However, the common choice of sense classes, the WordNet lexicographer files, is not designed for machine-learning-based word sense disambiguation. In this work, we explore the use of clustering techniques to construct sense classes that are better suited to the word sense disambiguation end task. Our results show that these classes can significantly improve classifier performance over state-of-the-art results for unrestricted word sense disambiguation.
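
As a rough illustration of grouping senses into classes by clustering, the sketch below merges senses whose feature vectors are most similar until a target number of classes remains. The abstract does not state which clustering algorithm or features the authors used, so everything here is an assumption: the greedy centroid-based agglomerative procedure, the cosine similarity measure, the sense labels, and the two-dimensional toy vectors are all hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cluster_senses(sense_vectors, num_classes):
    """Greedy agglomerative clustering (an assumed technique, not the paper's).

    Starts with one cluster per sense and repeatedly merges the two
    clusters whose centroids are most similar, until num_classes remain.
    """
    clusters = [[sense] for sense in sorted(sense_vectors)]
    while len(clusters) > num_classes:
        best = None  # (similarity, index_i, index_j)
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                ci = centroid([sense_vectors[s] for s in clusters[i]])
                cj = centroid([sense_vectors[s] for s in clusters[j]])
                sim = cosine(ci, cj)
                if best is None or sim > best[0]:
                    best = (sim, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical senses with made-up feature vectors (illustration only).
toy_vectors = {
    "bank_finance": [0.9, 0.1],
    "deposit_finance": [0.8, 0.3],
    "bank_river": [0.1, 0.9],
    "shore_river": [0.3, 0.8],
}

sense_classes = cluster_senses(toy_vectors, num_classes=2)
```

In a setup like this, each resulting cluster would serve as one coarse class label for training a classifier, in place of a WordNet lexicographer file.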

Similar resources

Optimizing feature set for Chinese Word Sense Disambiguation

This article describes the implementation of the I2R word sense disambiguation system (I2R-WSD), which participated in one Senseval-3 task: the Chinese lexical sample task. Our core algorithm is a supervised Naive Bayes classifier. This classifier utilizes an optimal feature set, which is determined by maximizing the cross-validated accuracy of the NB classifier on the training data. The optimal feature set incl...

Augmented Mixture Models for Lexical Disambiguation

This paper investigates several augmented mixture models that are competitive alternatives to standard Bayesian models and prove to be very suitable to word sense disambiguation and related classification tasks. We present a new classification correction technique that successfully addresses the problem of under-estimation of infrequent classes in the training data. We show that the mixture mod...

Modeling Consensus: Classifier Combination for Word Sense Disambiguation

This paper demonstrates the substantial empirical success of classifier combination for the word sense disambiguation task. It investigates more than 10 classifier combination methods, including second-order classifier stacking, over 6 major structurally different base classifiers (enhanced Naïve Bayes, cosine, Bayes Ratio, decision lists, transformation-based learning and maximum variance boost...

Trajectory Based Word Sense Disambiguation

Classifier combination is a promising way to improve performance of word sense disambiguation. We propose a new combinational method in this paper. We first construct a series of Naïve Bayesian classifiers along a sequence of orderly varying sized windows of context, and perform sense selection for both training samples and test samples using these classifiers. We thus get a sense selection tra...

Automatic Word Sense Disambiguation (WSD) System

This paper presents an automatic word sense disambiguation (WSD) system that uses Part-of-Speech (POS) tags along with word classes as discrete features. Word classes are derived from the Word Class Assigner using the Word Exchange Algorithm from statistical language processing. A Naïve Bayes classifier from Weka is employed in both the training and testing phases to perform the supervised le...

Publication date: 2007